Stochastic Approaches to Morphology Acquisition
Abstract
One of the first steps in acquiring a morphology system is discovering which phonetic strings correspond to morphemes. These phonetic strings can then be analyzed further to determine their grammatical privileges and contribution to meaning, and thus to bootstrap into a functional morphology system. Discovering the relevant phonetic strings only appears to be an easy task. Morpheme discovery presents a number of difficulties beyond those that arise in the similar task of word discovery and segmentation. Although both require segmenting a continuous speech stream, word segmentation can take advantage of the fact that some words are spoken in isolation, and those words can be used to bootstrap into the segmentation of other words. This will work for some morphemes (many words are monomorphemic), but in many languages, such as English and Spanish, grammatical morphemes are often bound and thus never heard in isolation. Additionally, there is no simple strategy that will universally work for breaking a word into its component morphemes. Although in many languages grammatical morphemes occur at either the beginning or the end of a word, an approach whereby the child assumes that the first or last syllable is a morpheme will work only if that assumption matches the language environment the child is exposed to. Since the affixing languages of the world can have (multiple) prefixes, suffixes, and infixes, such an approach is likely to fail. Finally, acquiring grammatical morphemes is much like acquiring function words: unlike nouns, function words have little concrete semantic meaning, which likely contributes to the difficulty of learning these types of words (Bird et al. 2001, Caselli et al. 1995, Gentner 1982, Morrison et al. 1997).
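As a rough illustration of why a fixed-position strategy depends on the input language, the sketch below measures how much of a word list is covered by the single most frequent string at one edge of the word. It is only a toy under stated assumptions: characters stand in for syllables, the function name `edge_support` is hypothetical, and the two small English word lists are hand-picked examples rather than data from the study.

```python
from collections import Counter


def edge_support(words, k, edge):
    """Share of words covered by the most frequent k-character string
    at the given edge ("start" or "end").

    Assumption: characters are used as a crude stand-in for syllables.
    """
    if edge not in ("start", "end"):
        raise ValueError("edge must be 'start' or 'end'")
    # Only consider words long enough to leave a non-empty stem.
    usable = [w for w in words if len(w) > k]
    if not usable:
        return 0.0
    pieces = [w[-k:] if edge == "end" else w[:k] for w in usable]
    _, count = Counter(pieces).most_common(1)[0]
    return count / len(usable)


# Hand-picked illustrative word lists (assumptions, not corpus data).
suffixed = ["walking", "jumping", "singing", "playing"]  # share the suffix -ing
prefixed = ["undo", "untie", "unlock", "unwrap"]         # share the prefix un-

# A "look at the end of the word" strategy fits the first list ...
print(edge_support(suffixed, 3, "end"))
# ... but finds no recurring ending in the second list,
print(edge_support(prefixed, 2, "end"))
# where the recurring material sits at the start of the word instead.
print(edge_support(prefixed, 2, "start"))
```

A learner committed to only one edge would find the affix in one list and miss it in the other, which is the sense in which the strategy works only when it happens to match the language being learned.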
The search for morpheme forms does have the advantage that a given morpheme generally occurs within certain syntactic environments (e.g., the morpheme -ing in English generally occurs with verbs). Although it has been noted that morphology can help a child acquire syntax (Morgan et al. 1987), the reverse may also be true. The relationship between morphology and syntax could be beneficial both for discovering bound morphemes and for knowing which words a given bound morpheme can attach to. For instance, -ing might be more readily detected as a suffix when examining only verbs than when examining all words. Additionally, once a child has discovered that -ing can be applied to a particular verb, extending that ending only to other verbs will greatly reduce overgeneralization errors.

There is a long history of research on morphology discovery models (e.g., Brent & Cartwright 1996, Goldsmith 2001, Harris 1955). Many of these systems, such as that of Erjavec and Džeroski (2004), are not designed to model child language acquisition but rather for computational tasks such as parsing a database. Because we are interested in how children acquire morphological forms, only models of language learning will be discussed here. To model the acquisition of morphological forms by children, an automatic morphology discovery system must have the following characteristics. First, since morphemes must be acquired by the child (i.e., they are highly language specific and thus cannot be innate), any morphology discovery system must use a plausible learning mechanism. This entails not only using information available to the language learner, but also using mechanisms that children possess. Second, because morphemes can appear as (multiple) prefixes, suffixes, and infixes in affixing languages, any morpheme discovery system must be flexible with respect to the position in the word where the morpheme occurs. Third, it must generate a robust list of morphemes that is minimally sufficient to allow the child to bootstrap into the rest of the morphological system. Finally, given that grammatical morphemes generally occur